vulnerability disclosure
Managing the Cybersecurity Vulnerabilities of Artificial Intelligence
Last week, Andy Grotto and I published a new working paper on policy responses to the risk that artificial intelligence (AI) systems, especially those dependent on machine learning (ML), can be vulnerable to intentional attack. As the National Security Commission on Artificial Intelligence found, "While we are on the front edge of this phenomenon, commercial firms and researchers have documented attacks that involve evasion, data poisoning, model replication, and exploiting traditional software flaws to deceive, manipulate, compromise, and render AI systems ineffective." The demonstrations of vulnerability are remarkable. In the speech recognition domain, research has shown that it is possible to generate audio that sounds like speech to ML algorithms but not to humans. There are multiple examples of tricking image recognition systems into misidentifying objects using perturbations that are imperceptible to humans, including in safety-critical contexts (such as road signs). One team of researchers fooled three different deep neural networks by changing just one pixel per image.
Predicting Exploitation of Disclosed Software Vulnerabilities Using Open-source Data
Bullough, Benjamin L., Yanchenko, Anna K., Smith, Christopher L., Zipkin, Joseph R.
Each year, thousands of software vulnerabilities are discovered and reported to the public. Unpatched known vulnerabilities are a significant security risk. It is imperative that software vendors quickly provide patches once vulnerabilities are known and that users install those patches as soon as they are available. However, most vulnerabilities are never actually exploited. Since writing, testing, and installing software patches can involve considerable resources, it would be desirable to prioritize the remediation of vulnerabilities that are likely to be exploited. Several published research studies have reported moderate success in applying machine learning techniques to the task of predicting whether a vulnerability will be exploited. These approaches typically use features derived from vulnerability databases (such as the summary text describing the vulnerability) or social media posts that mention the vulnerability by name. However, these prior studies share multiple methodological shortcomings that inflate the apparent predictive power of these approaches. We replicate key portions of the prior work, compare their approaches, and show how the selection of training and test data critically affects the estimated performance of predictive models. The results of this study point to important methodological considerations that should be taken into account so that results reflect real-world utility.
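The abstract does not spell out which data-selection choices matter, so the following is only a minimal sketch under one assumption: that a common pitfall is mixing vulnerabilities from the same time period into both training and test sets. It contrasts a shuffled split with a chronological one and checks whether any training record postdates the earliest test record. The `VulnRecord` type, its field names, the helper functions, and the synthetic records are all hypothetical and are not taken from the paper.

```python
import random
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class VulnRecord:
    """Hypothetical vulnerability record; fields are illustrative only."""
    vuln_id: str
    disclosed: date
    exploited: bool


def random_split(records, test_fraction=0.25, seed=0):
    """Shuffle, then split, so training and test periods may intermix."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def temporal_split(records, cutoff):
    """Train on records disclosed before `cutoff`; test on the rest."""
    train = [r for r in records if r.disclosed < cutoff]
    test = [r for r in records if r.disclosed >= cutoff]
    return train, test


def has_temporal_leakage(train, test):
    """True if any training record is dated on or after the earliest test record."""
    if not train or not test:
        return False
    return max(r.disclosed for r in train) >= min(r.disclosed for r in test)


# Synthetic data: eight made-up records, one per month, with arbitrary labels.
records = [
    VulnRecord(f"VULN-{i:04d}", date(2015, i, 1), exploited=(i % 4 == 0))
    for i in range(1, 9)
]

train_t, test_t = temporal_split(records, cutoff=date(2015, 7, 1))
train_r, test_r = random_split(records)

print("temporal split leaks:", has_temporal_leakage(train_t, test_t))
print("random split leaks:  ", has_temporal_leakage(train_r, test_r))
```

A chronological split more closely mimics deployment, where a model trained on past disclosures must score vulnerabilities it has not yet seen. Whether this is the dominant effect in any given study is an empirical question the sketch does not answer.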